Abstract. Global Value Numbering (GVN) is an important static analysis
to detect equivalent expressions in a program. We present an iterative
data-flow analysis GVN algorithm in SSA for the purpose of detecting
total redundancies. The central challenge is defining a join operation
to detect equivalences at a join point in polynomial time such that later
occurrences of redundant expressions can be detected. For this purpose,
we introduce the novel concept of value $\phi$-function. We claim the
algorithm is precise and takes only polynomial time.
1 Introduction

Global Value Numbering is an important static analysis to detect equivalent 
expressions in a program. Equivalences are detected by assigning value numbers 
to expressions. Two expressions are assigned the same value number if they 
could be detected as equivalent. The seminal work on GVN by Kildall [1] detects
all Herbrand equivalences [2] in non-SSA form of programs using the powerful
concept of structuring but takes exponential time. Efforts were made to improve
on efficiency in detecting equivalences. However, the algorithms are either as
precise as Kildall's [3] or efficient, but not both.

The drive to combine precision with efficiency has motivated our work in
this area. We propose an iterative data-flow analysis GVN algorithm to detect
redundancies in SSA form of programs that is as precise as Kildall's and efficient
(i.e., takes only polynomial time). As in any data-flow analysis problem, the central
challenge is to define a join operation that detects all equivalences at a join point in
polynomial time such that any later occurrences of redundant expressions can
be detected. We introduce the novel concept of value $\phi$-function for this purpose.

2 Terminology 

Program Representation. Input to our algorithm is the Control Flow Graph
(CFG) representation of a program in SSA. The graph has empty entry and
exit blocks. Other blocks contain assignment statements of the form $x = e$,
where $e$ is an expression which is either a constant, a variable, or of the form
$x \oplus y$ such that $x$ and $y$ are constants or variables and $\oplus$ is a generic binary
operator. An expression can also be of the form $\phi_k(x, y)$, called a $\phi$-function, where
$x$ and $y$ are variables and $k$ is the block in which it appears. We assume a block
can have at most two predecessors; a block with exactly two predecessors
is called a join block. The input and output points of a block are called the in and
out points, respectively, of the block. The in point of a join block is called a join
point. We may omit the subscript $k$ in $\phi_k$ when the join block is clear from the
context. In the CFGs we draw, $\phi$-functions appear in join blocks. For clarity
in explaining some of our concepts, however, we assume $\phi$-functions are transformed to
copy statements and appended to the appropriate predecessors of the join block.

Equivalence. Two expressions $e_1$ and $e_2$ are equivalent, denoted $e_1 \equiv e_2$, if they
will have the same value whenever they are executed. Two expressions are said
to be equivalent in a path if they have the same value whenever they are executed
along that path. We detect only Herbrand equivalences [2], that is, equivalences
among expressions that have the same operators and whose corresponding operands
are equivalent.
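To make the definition concrete, the following minimal Python sketch assigns value numbers in a straight-line program so that expressions with the same operator and equivalent operands receive the same number. The function name `number_straight_line`, the tuple encoding of expressions, and the four-statement program are illustrative assumptions and not part of the algorithm presented in this paper; join points are not handled here.

```python
from itertools import count


def number_straight_line(stmts):
    """Assign value numbers to a straight-line list of (target, expr) pairs.

    expr is either a name (variable or constant) or a tuple (op, a, b).
    Returns a dict mapping each name and value expression to a value number.
    """
    fresh = (f"v{i}" for i in count(1))
    vn = {}

    def lookup(key):
        # Reuse the value number if the key was seen before, else allocate one.
        if key not in vn:
            vn[key] = next(fresh)
        return vn[key]

    for target, expr in stmts:
        if isinstance(expr, tuple):
            op, a, b = expr
            # Value expression: same operator, operands replaced by value numbers.
            key = (op, lookup(a), lookup(b))
        else:
            key = expr  # plain copy: the target shares the source's value number
        vn[target] = lookup(key)
    return vn


# Hypothetical program: c1 = a1 + b1; x1 = a1; y1 = b1; z1 = x1 + y1
program = [
    ("c1", ("+", "a1", "b1")),
    ("x1", "a1"),
    ("y1", "b1"),
    ("z1", ("+", "x1", "y1")),
]
numbers = number_straight_line(program)
print(numbers["c1"], numbers["z1"])  # both names carry the same value number
```

Because `x1` and `y1` share the value numbers of `a1` and `b1`, the two additions map to the same value expression, which is exactly the Herbrand notion of equivalence.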


3 Basic Concept 

Our main goal is to detect equivalences, with a view to detecting redundancies in
a program, in polynomial time. We introduce the concept of value $\phi$-function for
this purpose. It is explained in this section, followed by our method to detect
redundancies.


3.1 Value $\phi$-function

Consider the simple code segment in Fig. 1 (a). Here, irrespective of the path
taken, $x_1 + y_1$ is equivalent to $a_1 + b_1$. In terms of the variables being assigned to,
we can say $z_1$ is equivalent to the same variable $c_1$. Now consider the code segment
in Fig. 1 (b). Depending on the path taken, expression $x_3 + y_3$ is equivalent to
either $x_1 + y_1$ or $x_2 + y_2$. In terms of the variables being assigned to, we can
say $w_3$ is equivalent to a merge of different variables, $p_1$ and $q_2$. Inspired by the
notion of $\phi$-function, we can say $w_3$ is equivalent to $\phi(p_1, q_2)$. This is an
extension of the notion of $\phi$-function seen in the literature, where a $\phi$-function
has different subscripted versions of the same non-SSA variable as arguments,
say $\phi(x_1, x_2)$. To express such equivalences, we introduce the concept
of value $\phi$-function, similar to the concept of value expression [3].

Fig. 1: Concept of value $\phi$-function. (a) Linear program (b) Program with branches

Value $\phi$-function. A value $\phi$-function is an abstraction of a set of equivalent
$\phi$-functions (including the extended notion of $\phi$-function). Let $v_i$, $v_j$ be value
numbers and $vpf$ be a value $\phi$-function. Then $\phi_k(v_i, v_j)$, $\phi_k(vpf, v_j)$, $\phi_k(v_i, vpf)$,
and $\phi_k(vpf, vpf)$ are value $\phi$-functions.
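The recursive shape of this definition can be sketched as a small Python data type. The class name `ValuePhi`, its field names, and the string encoding of value numbers are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class ValuePhi:
    """A value phi-function phi_k(left, right)."""
    block: int                      # join block k in which the merge happens
    left: Union[str, "ValuePhi"]    # a value number such as "v1", or a nested value phi-function
    right: Union[str, "ValuePhi"]

    def __str__(self) -> str:
        return f"phi{self.block}({self.left}, {self.right})"


# phi_3(v1, v4) has two value numbers as arguments ...
inner = ValuePhi(3, "v1", "v4")
# ... and phi_5(vpf, v6) takes a value phi-function as its first argument.
outer = ValuePhi(5, inner, "v6")
print(outer)
```

Making the type frozen keeps instances hashable, so a value $\phi$-function can be compared structurally and used as an annotation key when looking up classes.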

Partition. A partition at a point represents the equivalences that hold in the paths to
the point. An equivalence class in the partition has a value number and elements
such as variables, constants, and value expressions. It is also annotated with a value
$\phi$-function when necessary. The notation for a partition is similar to that in [3],
except that a class can be annotated with a value $\phi$-function.


4 Proposed Method 


Using the concept of value $\phi$-function, we propose an iterative data-flow analysis
algorithm to compute equivalences at each point in the program. The two main
tasks in this algorithm are the join operation and the transfer function.


4.1 Join operation. 

A join operation detects equivalences that are common in all paths to a join
point. The join is conceptually a class-wise intersection of input partitions. Let
$C_1$ and $C_2$ be two classes, one from each input partition. If the classes have
the same value number, then the resulting class $C$ is the intersection of $C_1$ and $C_2$. If the
classes have different value numbers, say $v_1$ and $v_2$ respectively, then the common
equivalences are again found by intersecting $C_1$ and $C_2$. The common equivalences
obtained are actually a merge of different variables, which is indicated by the
difference in value numbers, and hence class $C$ is annotated with $\phi(v_1, v_2)$. Now if
the classes have different value expressions, say $v_m + v_n$ and $v_p + v_q$ respectively,
the value expressions may be merged to form a resultant value expression, say
$v_i + v_j$. Value expressions $v_m + v_n$ and $v_p + v_q$ are merged to get $v_i + v_j$ by
recursively merging the classes of $v_m$ and $v_p$ to get the class of $v_i$, and the classes of $v_n$
and $v_q$ to get the class of $v_j$ [3]. But merging the value expressions can lead to
exponential growth of the resulting partition [5]. We therefore do not merge different value
expressions at the join; instead we merge them at a point where an expression represented
by $v_i + v_j$ actually occurs in the program. This merge is achieved simply by
detecting the equivalence of $v_i + v_j$ with $\phi(v_1, v_2)$ and is done during application of
the transfer function.


Example. Let us now consolidate the concept of join using an example. Consider
the case of applying join on partitions $P_1 = \{v_1, x_1, x_3 \mid v_2, y_1, y_3, v_1 + 1 \mid v_3, z_1, z_3\}$
and $P_2 = \{v_4, x_2, x_3 \mid v_5, y_2, y_3 \mid v_6, z_2, z_3, v_4 + 1\}$. In the classes with value numbers
$v_1$ in $P_1$ and $v_4$ in $P_2$ there is only one common variable, $x_3$, and this will appear
in a class in the resulting partition $P_3$. Since the two classes in $P_1$ and $P_2$ have
different value numbers, $v_1$ and $v_4$ respectively, the resulting class is annotated
with value $\phi$-function $\phi(v_1, v_4)$. The class is assigned a new value number, say
$v_7$. The resulting class is $|v_7, x_3 : \phi(v_1, v_4)|$. Now consider the classes with value
numbers $v_2$ in $P_1$ and $v_6$ in $P_2$. There are no obvious common equivalences in the
classes and we don't merge the different value expressions now. Hence no new
class is created. Similar strategies are adopted in detecting common equivalences
in other pairs of classes, one each from $P_1$ and $P_2$. The resulting partition $P_3$ is
$\{v_7, x_3 : \phi(v_1, v_4) \mid v_8, y_3 : \phi(v_2, v_5) \mid v_9, z_3 : \phi(v_3, v_6)\}$.


Join(P_1, P_2)
    P = {}
    for each pair of classes C_i ∈ P_1 and C_j ∈ P_2
        C_k = C_i ∩ C_j                              // set intersection
        if C_k ≠ {} and C_k does not have a value number
            then C_k = C_k ∪ {v_k, φ_b(v_i, v_j)}    // v_k is a new value number
                                                     // v_i ∈ C_i, v_j ∈ C_j, b is the join block
        P = P ∪ C_k                                  // ignore when C_k is empty
    return P

Note: We define a special partition $\top$ such that Join($\top$, $P$) = $P$ = Join($P$, $\top$).
We assume $\phi$-functions in a join block are transformed to copies and appended
to the appropriate predecessors of the join block.
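The join can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the names `EqClass`, `join`, and `TOP`, the encoding of a class as a value number plus a set of members, and the omission of the join-block subscript are all hypothetical choices. The input partitions are our reading of $P_1$ and $P_2$ from the example in this subsection.

```python
from dataclasses import dataclass
from itertools import count
from typing import FrozenSet, Iterator, Optional, Tuple

TOP = None  # assumed encoding of the special partition ⊤


@dataclass(frozen=True)
class EqClass:
    vn: str                                 # value number of the class
    members: FrozenSet[str]                 # variables, constants, value expressions
    phi: Optional[Tuple[str, str]] = None   # value phi-function annotation, if any


def join(p1, p2, fresh: Iterator[str]):
    """Class-wise intersection of two partitions, without merging value expressions."""
    if p1 is TOP:
        return p2
    if p2 is TOP:
        return p1
    result = []
    for ci in p1:
        for cj in p2:
            common = ci.members & cj.members
            if ci.vn == cj.vn:
                # Same value number on both paths: keep it. Keeping the annotation
                # only when both sides agree is an assumption of this sketch.
                result.append(EqClass(ci.vn, common, ci.phi if ci.phi == cj.phi else None))
            elif common:
                # Different value numbers: new number, annotated with phi(v_i, v_j).
                result.append(EqClass(next(fresh), common, (ci.vn, cj.vn)))
    return result


p1 = [
    EqClass("v1", frozenset({"x1", "x3"})),
    EqClass("v2", frozenset({"y1", "y3", "v1+1"})),
    EqClass("v3", frozenset({"z1", "z3"})),
]
p2 = [
    EqClass("v4", frozenset({"x2", "x3"})),
    EqClass("v5", frozenset({"y2", "y3"})),
    EqClass("v6", frozenset({"z2", "z3", "v4+1"})),
]
fresh = (f"v{i}" for i in count(7))
p3 = join(p1, p2, fresh)
for c in p3:
    print(c.vn, sorted(c.members), c.phi)
```

The value expressions `v1+1` and `v4+1` are different strings, so the pair of classes `v2` and `v6` contributes nothing, which mirrors the decision to postpone merging value expressions until the transfer function.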


4.2 Transfer Function. 

Given a partition $PIN_s$ that represents the equivalences at the in point of a statement
$s : x = e$, the transfer function computes the equivalences at its out point, denoted
$POUT_s$. Let $ve$ be the value expression of $e$ computed using $PIN_s$. If $ve$ is
present in a class in $PIN_s$, then $x$ is just inserted into the corresponding class in
$POUT_s$. Otherwise the transfer function checks whether $e$ could be expressed as
a merge of variables represented by a value $\phi$-function $vpf$ (as illustrated below).
If $vpf$ is present in a class in $PIN_s$, then $x$ and $ve$ are inserted into the corresponding class
in $POUT_s$. Else a new class is created in $POUT_s$ with a new value number, and
$x$, $ve$, and $vpf$ are inserted into it.

For an example, consider processing the statement $w_3 = x_3 + y_3$ as shown in the
code segment in Fig. 2. Since the value expression $v_7 + v_8$ of $x_3 + y_3$ is not in $PIN_3$,
the transfer function proceeds to check whether $x_3 + y_3$ is actually a merge of
variables as follows:

$$x_3 + y_3 = v_7 + v_8 = \phi(v_1, v_4) + \phi(v_2, v_5) = \phi(v_1 + v_2, v_4 + v_5) = \phi(v_3, v_6).$$

This implies $x_3 + y_3$ is actually a merge of variables, here $p_1$ and $q_2$. Since neither
$v_7 + v_8$ nor $\phi(v_3, v_6)$ is present in $PIN_3$, a new class is created in $POUT_3$ with a
new value number, say $v_9$, and $w_3$, $v_7 + v_8$, and $\phi(v_3, v_6)$ are inserted into it.
The classes in $PIN_3$ are inserted as such into $POUT_3$. The resulting partition
$POUT_3$ is $\{v_7, x_3 : \phi(v_1, v_4) \mid v_8, y_3 : \phi(v_2, v_5) \mid v_9, w_3, v_7 + v_8 : \phi(v_3, v_6)\}$.

Fig. 2: Concept of Transfer Function


transferFunction(x = e, PIN_s)
    POUT_s = PIN_s
    C_i = C_i − {x}                                  // x ∈ C_i, a class in POUT_s
    ve = valueExpr(e)
    vpf = valuePhiFunc(ve, PIN_s)                    // can be NULL
    if ve or vpf is in a class C_i in POUT_s         // ignore vpf when NULL
        then C_i = C_i ∪ {x, ve}                     // set union
        else POUT_s = POUT_s ∪ {v_n, x, ve : vpf}    // v_n is a new value number
    return POUT_s

valuePhiFunc is a recursive algorithm that computes the value $\phi$-function
corresponding to the input value expression when possible; otherwise it returns NULL.
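The transfer function can be illustrated with the following Python sketch. It makes several assumptions that the text does not fix: a partition is a dict from value number to a class with `members` and `phi` fields, the helper names are hypothetical, and a table `vn_of_expr` records the value numbers of value expressions seen on the predecessor paths. In addition, `value_phi_func` here is a one-level, non-recursive simplification of valuePhiFunc, so nested value $\phi$-functions are not produced.

```python
import copy
from itertools import count


def class_of(partition, member):
    """Value number of the class that contains `member`, or None."""
    for vn, cls in partition.items():
        if member in cls["members"]:
            return vn
    return None


def value_expr(e, partition):
    """Replace the operands of e = (op, a, b) by their value numbers."""
    op, a, b = e
    # Operands found in no class (e.g. constants) stand for themselves (assumption).
    return (op, class_of(partition, a) or a, class_of(partition, b) or b)


def value_phi_func(ve, partition, vn_of_expr):
    """One-level sketch: push the operator through the operands' annotations."""
    op, a, b = ve
    phi_a = partition.get(a, {}).get("phi")
    phi_b = partition.get(b, {}).get("phi")
    if phi_a is None and phi_b is None:
        return None
    a1, a2 = phi_a or (a, a)  # an unannotated operand is the same on both paths
    b1, b2 = phi_b or (b, b)
    left = vn_of_expr.get((op, a1, b1))
    right = vn_of_expr.get((op, a2, b2))
    if left is None or right is None:
        return None
    return (left, right)


def transfer(x, e, pin, vn_of_expr, fresh):
    """Compute POUT for the statement x = e from PIN (PIN is left unchanged)."""
    pout = copy.deepcopy(pin)
    for cls in pout.values():
        cls["members"].discard(x)  # remove any old occurrence of x
    ve = value_expr(e, pin)
    vpf = value_phi_func(ve, pin, vn_of_expr)
    target = class_of(pout, ve)
    if target is None and vpf is not None:
        target = next((vn for vn, c in pout.items() if c["phi"] == vpf), None)
    if target is not None:
        pout[target]["members"] |= {x, ve}
    else:
        pout[next(fresh)] = {"members": {x, ve}, "phi": vpf}
    return pout


# The statement w3 = x3 + y3 from the worked example.
pin3 = {
    "v7": {"members": {"x3"}, "phi": ("v1", "v4")},
    "v8": {"members": {"y3"}, "phi": ("v2", "v5")},
}
vn_of_expr = {("+", "v1", "v2"): "v3", ("+", "v4", "v5"): "v6"}
pout3 = transfer("w3", ("+", "x3", "y3"), pin3, vn_of_expr, (f"v{i}" for i in count(9)))
print(pout3["v9"])
```

In this run neither the value expression nor the computed annotation occurs in the input partition, so a new class `v9` is created that holds `w3`, the value expression, and the annotation built from `v3` and `v6`.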


4.3 Detect Redundancies. 

Given the partition $POUT$ at the out point of a statement $x = e$, expression $e$ is detected to be
redundant if there exists a variable other than $x$ in the class of $x$ in $POUT$, or if
the class of $x$ in $POUT$ is annotated with a value $\phi$-function. In the example code
in Fig. 2, consider the case of checking whether $x_3 + y_3$ in the last statement
$w_3 = x_3 + y_3$ is redundant. In the class of $w_3$ in $POUT_3$ (computed in the previous
subsection) there are no variables other than $w_3$. However, the class is annotated
with a value $\phi$-function. Hence the expression $x_3 + y_3$ is detected to be redundant.
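The check described above amounts to a short predicate. In the sketch below, the function name `is_redundant` and the dict layout with `vars` and `phi` fields are assumptions chosen for illustration, and the class `v10` is a hypothetical addition included only for contrast.

```python
def is_redundant(x, pout):
    """True if the expression assigned to x is redundant according to POUT."""
    for cls in pout.values():
        if x in cls["vars"]:
            has_other_variable = any(v != x for v in cls["vars"])
            return has_other_variable or cls["phi"] is not None
    return False  # x appears in no class, so nothing is known about it


pout3 = {
    "v7": {"vars": {"x3"}, "phi": ("v1", "v4")},
    "v8": {"vars": {"y3"}, "phi": ("v2", "v5")},
    "v9": {"vars": {"w3"}, "phi": ("v3", "v6")},
    "v10": {"vars": {"t1"}, "phi": None},  # hypothetical class, not from the example
}
print(is_redundant("w3", pout3), is_redundant("t1", pout3))
```

The class of `w3` has no other variable but carries an annotation, so the first call reports a redundancy, while `t1` is alone in an unannotated class and is not reported.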

Theorem 1. Two program expressions are equivalent at a point iff the iterative 
data-flow analysis algorithm detects their equivalence. 


Proof. This can be proved by induction on the length of a path in a program. □ 





5 Complexity Analysis 


Let there be $n$ expressions in a program. The two main operations in this iterative
algorithm are the join and the transfer function. By the definitions of Join and
transferFunction, a partition can have $O(n)$ classes. If there are $j$ join points, the
total time taken by all the join operations in an iteration is $O(n \cdot j)$. The transfer
function involves constructing and then looking up a value expression or value
$\phi$-function in the input partition. The transfer function of a statement takes
$O(n \cdot j)$ time. In an iteration, the total time taken by the transfer functions is $O(n^2 \cdot j)$.
Thus the time taken by all the joins and transfer functions in an iteration is
$O(n^2 \cdot j)$. In the worst case the iterative analysis takes $n$ iterations, and hence the
total time taken by the analysis is $O(n^3 \cdot j)$.
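The bounds above compose as follows. The symbols $T_{\text{join}}$, $T_{\text{transfer}}$, $T_{\text{iteration}}$, and $T_{\text{total}}$ are introduced here only for readability, and the second line assumes that the number of statements is at most $n$.

```latex
\begin{align*}
T_{\text{join}}      &= O(n \cdot j) \\
T_{\text{transfer}}  &= n \cdot O(n \cdot j) = O(n^2 \cdot j) \\
T_{\text{iteration}} &= O(n \cdot j) + O(n^2 \cdot j) = O(n^2 \cdot j) \\
T_{\text{total}}     &= n \cdot O(n^2 \cdot j) = O(n^3 \cdot j)
\end{align*}
```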

6 Conclusion 

We presented a GVN algorithm using the novel concept of value $\phi$-function, which
makes the algorithm precise and efficient.